Preface


The science of statistics can be of great use to geodesists and
photogrammetrists in reducing and analyzing data obtained from
measurements. This thesis will explore the applications of statistical
hypothesis tests in the field of geodesy.

The data for examples have been taken from various studies in
the field of geodetic science. In each case the source of the data
is noted. Often the conclusions drawn by statistical tests will not
agree with the conclusions drawn by the original experimenter. It
is not the intent of this thesis to criticize the work of others,
but only to give examples of how statistical tests may be used to guide
geodesists in drawing conclusions from observed data.

The author gratefully acknowledges the guidance of Dr. Urho A.
Uotila and the assistance and inspiration of his wife, Judy, in the
preparation of this thesis.
CHAPTER 1


INTRODUCTION 

In geodesy and photogrammetry, large quantities of data are
collected in the form of measurements such as angles, lengths, and
gravity values. It is imperative that the relevant information
contained in a mass of geodetic data be expressed by comparatively few
values. To accomplish this task, geodesists, like scientists in other
fields of physical and social science, have found many solutions through
the science of statistics.

Statistics is the study of populations, variances, and methods
of data reduction. Some statistical methods, notably the method of
least squares, have found universal acceptance in geodesy. Other
statistical methods are not so widely used. It is the purpose of this
thesis to explore the applications of statistical hypothesis testing to
the problems of geodesy.


1.1 Statistical Theory
A STATISTIC is a value calculated from an observed sample with
a view to characterizing the population from which it is drawn.


Statistics which will be of value to geodesy are:

Population mean                    $\mu$
Sample mean                        $\bar{X} = \frac{1}{n}\sum X_i$
True variance                      $\sigma^2$
Variance estimate¹                 $s^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1}$
Fisher's F statistic               $F = s_1^2 / s_2^2$
Pearson's $\chi^2$ statistic       $\chi^2$
Student's t statistic              $t$
The standard error                 $\sigma$
The estimate of standard error¹    $s$


A statistic which, on the average, gives the right answer is
said to be unbiased. If a statistic gives values which are concentrated
more closely about the right value, the statistic is said to be
efficient.

A widely used concept in statistical theory is a random
variable. A random variable is a quantity which takes on a definite
value at every point of a sample space. Geodetic measurements are
considered to be random variables.²


1.2 Density and Distribution Functions
Assume that a sample space is such that each point of the sample
space can be characterized by the value of a continuous variable, x,
which can take on all values between $-\infty$ and $+\infty$. Let the event A be
defined as the set of all sample points characterized by the inequality

$x \le x^0$

where $x^0$ is some fixed value. The cumulative probability distribution
function, $F(x^0)$, and the probability density function, $\phi(x)$, are
defined by

$P(A) = F(x^0) = \int_{-\infty}^{x^0} \phi(x)\, dx.$

The density function, $\phi(x)$, has the significance that

$P(x^0 \le x \le x^0 + dx) \approx \phi(x^0)\, dx.$

Since the total probability for a sample space must equal unity, $\phi(x)$
must include a normalization factor such that

$F(\infty) = \int_{-\infty}^{+\infty} \phi(x)\, dx = 1.$

The basic requirement for any probability density function is
that this integral exists. A further requirement is that

$\phi(x) \ge 0$ for $-\infty \le x \le +\infty$.

¹The letter m is commonly used in geodetic literature. In most
statistical literature Greek letters represent true values and the Roman
equivalent represents the statistical estimate of this value. For ease of
understanding, the statistical convention will be followed.

²See discussion by J.L. Stearn (1964).

There are many distributions and density functions. As an
example the uniform density function is defined

$\phi(x) = \frac{1}{2a}$ for $-a \le x \le +a$

and

$\phi(x) = 0$ otherwise.

Figure 1 shows the probability density function, $\phi(x)$, and the
cumulative probability distribution function, $F(x)$, for this
uniform density function.


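Figure 1. Uniform distribution on $-a \le x \le +a$: probability density
function $\phi(x)$ (left) and cumulative probability distribution
function $F(x)$ (right).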

A quantity is said to be normally distributed when it takes all
values from $-\infty$ to $+\infty$, with frequencies given by a definite
mathematical law, namely, the logarithm of the frequency at any distance, d,
from the center of the distribution is less than the logarithm of the
frequency at the center by a quantity proportional to $d^2$. The
distribution is symmetric with greatest frequency at the center (Fisher, 1925).
The density function of a normal distribution of mean $\mu$ and variance $\sigma^2$
is given by the expression

$\phi(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x - \mu)^2}{2\sigma^2}\right].$

Figure 2 shows the probability density function and the cumulative
probability distribution function for the normal distribution.
The scale of x can be changed by measuring each x value by its distance
from the mean, and adopting the standard deviation $\sigma$ as a unit of
measurement. The rescaled variable is then

$w = \frac{x - \mu}{\sigma}.$

The quantity w is called the standard normal deviate (Mandel, 1964).
A unit normal deviate is a variable whose distribution has a mean of 0
and a variance of unity.


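Figure 2. Normal distribution: probability density function $\phi(x)$
(left) and cumulative probability distribution function $F(x)$ (right).
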
The normal distribution is very useful in statistics because of
the Central Limit Theorem. This theorem may be expressed as follows:
Given a population of values with a finite variance, if independent
samples are taken from this population, all of size N, then the
population formed by the averages of these samples will tend to have a
normal distribution, regardless of what the distribution of the original
population is; the larger N, the greater will be this tendency
towards normality.
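
The tendency described by the theorem can be illustrated numerically. The following is only a sketch under stated assumptions: the parent population is taken to be the uniform distribution on (-a, +a) of Figure 1, and the function name, sample size, number of samples, and random seed are all illustrative choices, not part of the original study.

```python
import random
import statistics


def sample_means(draw, sample_size, n_samples, seed=0):
    """Return the means of n_samples independent samples of size sample_size."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.fmean(draw(rng) for _ in range(sample_size))
        for _ in range(n_samples)
    ]


a = 1.0            # half-width of the uniform parent population (assumed)
sample_size = 30   # N in the statement of the theorem (assumed)
means = sample_means(lambda rng: rng.uniform(-a, a), sample_size, 5000)

# A uniform population on (-a, a) has variance a**2 / 3, so the averages
# should scatter with standard deviation (a / sqrt(3)) / sqrt(N).
expected_spread = (a / 3 ** 0.5) / sample_size ** 0.5
observed_spread = statistics.pstdev(means)

# For a normal distribution about 68% of values lie within one standard
# deviation of the mean; the averages should come close to that figure.
within_one_sigma = sum(abs(m) <= expected_spread for m in means) / len(means)

print(f"expected spread {expected_spread:.4f}, observed {observed_spread:.4f}")
print(f"fraction within one standard deviation: {within_one_sigma:.3f}")
```

Although the parent population is flat rather than bell shaped, the averages of the samples behave very nearly as a normal population would.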
Thus far only functions of one variable have been discussed.
If there is more than one variable associated with each point in sample
space, multivariate functions may be defined. For example, a
multivariate probability density function is defined as

$\phi(x_1^0, x_2^0, \ldots, x_n^0)\, dx_1\, dx_2 \cdots dx_n = P(x_1^0 \le x_1 \le x_1^0 + dx_1, \ldots, x_n^0 \le x_n \le x_n^0 + dx_n).$

The right side of this equation may be interpreted as the probability
that all the inequalities hold simultaneously. For an excellent
discussion, the reader is referred to pages 16 through 18 of Hamilton (1964).


1.3 Statistical Hypotheses


Webster defines a hypothesis as a tentative theory or supposition
provisionally adopted to explain certain facts and to guide in
the investigation of others (Webster, 1954). A statistical hypothesis
is thus a theory about some population.

The only way that one can be absolutely certain of the truth or
falsity of a statistical hypothesis is to examine the entire population.
Since measurements can take on an infinite number of values,
examination of the entire population in geodetic applications is impossible.
One is then forced to make a decision based on a few measurements.
Statistically speaking, these measurements are a sample taken from the
population. The process of using this sample to test the truth or
falsity of a hypothesis is called statistical testing. There is in these
tests no certainty that a mistake has not been made. There are, in fact,
two different kinds of errors which can be made. These are called:


Type I ($\alpha$) error -- the rejection of a hypothesis
which is true


Type II ($\beta$) error -- the acceptance of a hypothesis
which is false


These errors will be discussed in detail later.


Table 1


Definition of the Types of Errors Associated
with the Tests of Statistical Hypotheses


                                  True Situation
Decision                   Hypothesis True    Hypothesis False

Accept the Hypothesis      No Error           Type II Error

Reject the Hypothesis      Type I Error       No Error
CHAPTER 2


HYPOTHESIS TESTING 


It may be of value to know if data are normally distributed.
A simple test is to compare the histogram to a normal curve (Dixon,
1957). The percentage of the data in a given group of the histogram
can be compared to the area under the normal curve corresponding to the
given group.

For example, given a data set with a mean of 30 and a standard
deviation of 5, 80% of the data are found between 20 and 35. Are
these data normally distributed?

The standard normal deviates for the data group are

$w_{20} = \frac{20 - 30}{5} = -2.0$
$w_{35} = \frac{35 - 30}{5} = 1.0.$

From Appendix I the area under the normal curve is

from $-\infty$ to 1.0 = 0.8413
from $-\infty$ to -2.0 = 1.0 - 0.9772 = 0.0228.

The area under the normal curve between the two deviates is
0.8413 - 0.0228 = 0.8185,
or 81.85%. This would indicate that the data in this group have a
distribution close to a normal distribution.
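
The table lookup in this example can be reproduced numerically. The sketch below uses the `NormalDist` class of the Python standard library in place of Appendix I; the function name `normal_area` is an illustrative choice.

```python
from statistics import NormalDist


def normal_area(lower, upper, mean, std_dev):
    """Area under the normal curve of given mean and std_dev between two limits."""
    dist = NormalDist(mu=mean, sigma=std_dev)
    return dist.cdf(upper) - dist.cdf(lower)


# Values from the example: mean 30, standard deviation 5, group from 20 to 35.
area = normal_area(20.0, 35.0, mean=30.0, std_dev=5.0)
observed_fraction = 0.80

print(f"area under the normal curve: {area:.4f}")
print(f"observed fraction of data:   {observed_fraction:.4f}")
```

The computed area agrees with the tabulated figure to within the rounding of a four place table.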

The $\chi^2$ statistic, discussed in section 2.4, may also be used to
test the distribution of a group of observations. A discussion of this
test will be deferred until the statistic has been introduced.

For an example of more elaborate tests for normal distribution
the reader is referred to the paper by Stearn (1964).
2.1 Tests Involving the Normal Distribution





If a single observation, x, is made from a normally distributed
population of mean, $\mu$, and variance, $\sigma^2$, statistical theory states
that (Hamilton, 1964)

(1) $w = \frac{x - \mu}{\sigma}$

has a normal distribution. The probability that the magnitude of w
calculated in this manner exceeds some specified value is

(2) $P(|w| > w_\gamma)$

where $w_\gamma$ is the value of w to the right of which lies an area $\gamma$ under
the probability density curve, that is,

(3) $P(w > w_\gamma) = \gamma.$

We can then write

(4) $P(|w| > w_\gamma) = \frac{1}{\sqrt{2\pi}} \left[ \int_{-\infty}^{-w_\gamma} e^{-w^2/2}\, dw + \int_{w_\gamma}^{+\infty} e^{-w^2/2}\, dw \right].$

From Figure 3 we can interpret the value of the integral as the cross
hatched area under the curve.





Figure 3 


Mathematically, Hamilton (1964) shows

(5) $P(|w| > w_\gamma) = 2F(-w_\gamma) = 2(1 - F(w_\gamma)) = 2\gamma.$

The value of $F(w_\gamma)$ can be found from a table of the cumulative normal
distribution function (Appendix I).

If $w_\gamma$ is 1.96,

$P(|w| > 1.96) = 2(1 - 0.975) = 0.05.$

Equation (5) can be expressed as

(6) $P(|x - \mu| > \sigma w_\gamma) = 2(1 - F(w_\gamma))$

or

(7) $P(\mu - \sigma w_\gamma \le x \le \mu + \sigma w_\gamma) = 1 - 2(1 - F(w_\gamma)) = 2F(w_\gamma) - 1.$

For the example

$P(\mu - 1.96\sigma \le x \le \mu + 1.96\sigma) = 0.95,$

that is, the probability that a single observation, in the normal
population given, will lie within $1.96\sigma$ of the mean, is 95%.

The inequality expressed in (7) forms the confidence interval
which is the basis for the test of a hypothesis involving the mean of a
normal population.
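
Equations (5) and (7) can be evaluated without a table. The sketch below is illustrative only; it takes F to be the cumulative distribution function of the unit normal deviate, supplied here by the Python standard library, and the function names are assumptions.

```python
from statistics import NormalDist

F = NormalDist().cdf  # cumulative distribution of the unit normal deviate


def two_tailed_probability(w_gamma):
    """Equation (5): probability that |w| exceeds w_gamma."""
    return 2.0 * (1.0 - F(w_gamma))


def interval_probability(w_gamma):
    """Equation (7): probability that x lies within sigma * w_gamma of the mean."""
    return 2.0 * F(w_gamma) - 1.0


p_outside = two_tailed_probability(1.96)
p_inside = interval_probability(1.96)

print(f"P(|w| > 1.96) = {p_outside:.4f}")
print(f"P(mean - 1.96 sigma <= x <= mean + 1.96 sigma) = {p_inside:.4f}")
```

The two probabilities are complementary, which is the relation used to pass from (5) to (7).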

Symbolically this null hypothesis, $H_0$, can be expressed

(8) $H_0 : \mu = \mu_0$

and the alternative

$H_1 : \mu \ne \mu_0.$

The unit normal deviate is calculated from (1). If $|w| > 1.96$,
the hypothesis, $H_0$, can be rejected. The risk of rejecting a true $H_0$
(TYPE I ERROR), or the level of significance ($\alpha$), is .05. It should be
noted that any other level of significance could be selected simply by
selecting a different value for $w_\gamma$ from the table. As an example, if
$w_\gamma$ were 1.64, $\alpha$ would be .10.


The area $\gamma$, shown in Figure 3, varies directly as $\alpha$. The
shaded area, $2\gamma$, is in this case equal to $\alpha$. This region is known as
the critical region (Ostle, 1963). If the value of the test statistic
used in a particular test of a statistical hypothesis falls in this
region, the hypothesis is rejected.
If $\alpha = 2\gamma$, the test is said to be two tailed; that is, the
hypothesis is rejected if either
$w > 1.96$
or
$w < -1.96.$
The test can also be formulated such that the hypothesis is rejected
only if
$w > 1.96$
or only if
$w < -1.96.$
In each of these cases the probability of a Type I error (rejection of
a true hypothesis) is the area under only one tail of the curve and
(9) $\alpha = \gamma.$


Figure 4 shows the critical region for the one-tailed test.


Figure 4. Normal density curve with one tail marked as the critical region.
A test of a statistical hypothesis based on either extreme of
the distribution is a one tailed test. It is interesting to note that
for any symmetrical distribution a one tailed test at the $100\alpha\%$
significance level is equivalent to a two tailed test at the $100(2\alpha)\%$
significance level. This is true since the same critical value of w applies
to the two cases.


2.11 Test Procedure


In order to test the hypothesis
$H_0 : \mu = \mu_0$
against the alternative
$H_1 : \mu \ne \mu_0$
at a significance level $\alpha$, the following steps would be followed:
(1) Select a level of significance $\alpha$, and find the corresponding
$w_\gamma$ from the appropriate table.
(2) Compute the unit normal deviate
$w = \frac{\bar{X} - \mu_0}{\sigma}.$
(3) Reject the hypothesis if $|w| > w_\gamma$.
For example, a set of observations has a sample mean $\bar{X}$ of 1.50, and a
variance, $\sigma^2$, of 0.25. Could the true mean, $\mu$, be 2.00? The
hypothesis to be tested is
$H_0 : \mu = 2.00.$
If $\alpha$ is selected to be 0.05, the tabulated value of $w_\gamma$ is 1.96,

$w = \frac{1.50 - 2.00}{0.5} = \frac{-0.5}{0.5} = -1$

$|w| = 1 < 1.96.$
Therefore, the hypothesis can be accepted at the level of significance
selected.
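
The three steps of the procedure can be sketched in a few lines. This is an illustration rather than a prescribed implementation: the function name is hypothetical, and the tabulated value of $w_\gamma$ is obtained from the inverse cumulative normal distribution of the Python standard library instead of a printed table.

```python
from statistics import NormalDist


def unit_normal_test(x_bar, mu_0, sigma, alpha=0.05):
    """Two tailed test of H0: mu = mu_0 when the standard error sigma is known."""
    # Step 1: critical value for the chosen level of significance.
    w_gamma = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    # Step 2: unit normal deviate.
    w = (x_bar - mu_0) / sigma
    # Step 3: reject when the deviate falls in the critical region.
    return w, w_gamma, abs(w) > w_gamma


# Values from the example: mean 1.50, variance 0.25, hypothesized mean 2.00.
w, w_gamma, reject = unit_normal_test(1.50, 2.00, sigma=0.25 ** 0.5)

print(f"w = {w:.2f}, critical value = {w_gamma:.2f}, reject H0: {reject}")
```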





The testing of a hypothesis such as this has very little
practical significance in geodesy, since the variance, $\sigma^2$, is seldom
known.
This test has been discussed in great detail because the principles
involved apply to all statistical tests, regardless of the
statistic or distribution used.


2.2 Testing with a Sample Mean


The sample mean of a set of observations is defined as

$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} x_i$

where n is the number of observations.
Statistical theory states that the sum of normally distributed
random variables is normally distributed (Hamilton, 1964).

The density function of $\bar{X}$ is

$\phi(\bar{X}) = \frac{1}{\sigma} \left(\frac{n}{2\pi}\right)^{1/2} \exp\left[-\frac{n(\bar{X} - \mu)^2}{2\sigma^2}\right].$

The standardized variable is then

$w = \frac{(\bar{X} - \mu)\, n^{1/2}}{\sigma}.$

Following the same method used for a single observation, the
probability that the deviation of the sample mean does not exceed a
specific value is

$P\left(\mu - \frac{\sigma w_\gamma}{n^{1/2}} \le \bar{X} \le \mu + \frac{\sigma w_\gamma}{n^{1/2}}\right) = 2F(w_\gamma) - 1.$

To test a hypothesis from a sample of n observations and a sample mean
of $\bar{X}$,

$H_0 : \mu = \mu_0$
against
$H_1 : \mu \ne \mu_0,$
compute

$w = \frac{(\bar{X} - \mu_0)\, n^{1/2}}{\sigma}.$





The hypothesis is rejected at the $100\alpha\%$ level if $|w| > w_\gamma$.
With this test a confidence interval is established around $\bar{X}$.
The probability that the true mean, $\mu$, lies within this interval is

$P\left(\bar{X} - \frac{\sigma w_\gamma}{n^{1/2}} \le \mu \le \bar{X} + \frac{\sigma w_\gamma}{n^{1/2}}\right) = 1 - \alpha.$

This test would be useful to determine if a new set of observations
is part of an established population with a mean $\mu_0$ and a
standard error of $\sigma$.
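
A short sketch of the test with a sample mean follows. The observations, the hypothesized mean, and the standard error below are hypothetical numbers chosen only to exercise the formulas; they are not taken from any of the studies cited, and the function name is an assumption.

```python
from math import sqrt
from statistics import NormalDist, fmean


def sample_mean_test(observations, mu_0, sigma, alpha=0.05):
    """Test H0: mu = mu_0 from a sample mean when sigma is known.

    Returns the deviate w, the rejection decision, and the confidence
    interval established around the sample mean.
    """
    n = len(observations)
    x_bar = fmean(observations)
    w_gamma = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    w = (x_bar - mu_0) * sqrt(n) / sigma
    half_width = sigma * w_gamma / sqrt(n)
    return w, abs(w) > w_gamma, (x_bar - half_width, x_bar + half_width)


# Hypothetical observations (for illustration only).
observations = [10.2, 9.8, 10.4, 10.1, 9.9, 10.3, 10.0, 10.1, 9.7]
w, reject, interval = sample_mean_test(observations, mu_0=10.0, sigma=0.3)

print(f"w = {w:.3f}, reject H0: {reject}")
print(f"95% confidence interval: {interval[0]:.3f} to {interval[1]:.3f}")
```

The hypothesis is rejected exactly when the hypothesized mean falls outside the confidence interval, so the two views of the test always agree.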

It has been shown that the probability of a Type I error is
the level of significance chosen for the test. There is also the
possibility of committing a Type II error, that is, the error of
accepting a false hypothesis.

The probability, $\beta$, of a Type II error is dependent upon the
specific alternative hypothesis which is presumed to be true

$H_1 : \mu = \mu_1.$

The probability of a Type II error is

(10) $\beta = P(-w_\gamma \le w_0 \le w_\gamma),$

when $H_1$ is true. From (1)

$w_0 = \frac{(\bar{X} - \mu_0)\, n^{1/2}}{\sigma}.$

If the alternate hypothesis is true, $w_1 = (\bar{X} - \mu_1)\, n^{1/2}/\sigma$ is
distributed as the unit normal deviate, and

(11) $w_1 = \frac{n^{1/2}(\bar{X} - \mu_0 + \mu_0 - \mu_1)}{\sigma} = w_0 + \frac{(\mu_0 - \mu_1)\, n^{1/2}}{\sigma}.$

This can be written as

(12) $w_0 = w_1 + \frac{(\mu_1 - \mu_0)\, n^{1/2}}{\sigma}.$

Equation (10) can then be written as

(13) $\beta = P\left(-w_\gamma \le w_1 + \frac{(\mu_1 - \mu_0)\, n^{1/2}}{\sigma} \le w_\gamma\right)$

(14) $\beta = P\left(-w_\gamma - \frac{(\mu_1 - \mu_0)\, n^{1/2}}{\sigma} \le w_1 \le w_\gamma - \frac{(\mu_1 - \mu_0)\, n^{1/2}}{\sigma}\right).$
It should be noted that as $(\mu_1 - \mu_0)$ becomes smaller the
probability of not rejecting a hypothesis when it is false is nearly as great
as that of not rejecting it if it is true. The power of a statistical test is
defined as $1 - \beta$. The power of a test increases as $(\mu_1 - \mu_0)/\sigma$
becomes larger or as the number of observations increases (Hamilton, 1964).
Figure 5 shows the power of the test for $\alpha$ = .05 and .01, as a function
of d, where $d = (\mu_1 - \mu_0)\, n^{1/2}/\sigma$.


Figure 5. Power of the test, $1 - \beta$, plotted against d from -5.0 to 5.0.

Note that $1 - \beta$ varies directly as $\alpha$. The value of $\beta$ is
determined by the value chosen for $\alpha$. If the critical region is
increased, $\beta$ decreases. Concerning this problem Graybill (1961) states,

"We should like to minimize the likelihood, or probability,
of making either of these two errors. However,
in general, for a fixed number of observations, if we
decrease the probability of making an error of one type,
we increase the chances of making the other."


Power curves can be found in Table A-12 of Dixon (1957).
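
The power of the two tailed test can also be computed directly from equation (14). The sketch below assumes only that $w_1$ is a unit normal deviate and that $d = (\mu_1 - \mu_0)\, n^{1/2}/\sigma$; the function name `power` is an illustrative choice.

```python
from statistics import NormalDist


def power(d, alpha=0.05):
    """Power 1 - beta of the two tailed test as a function of d.

    beta is the probability, from equation (14), that the unit normal
    deviate w1 lies between -w_gamma - d and w_gamma - d.
    """
    std_normal = NormalDist()
    w_gamma = std_normal.inv_cdf(1.0 - alpha / 2.0)
    beta = std_normal.cdf(w_gamma - d) - std_normal.cdf(-w_gamma - d)
    return 1.0 - beta


for d in (0.0, 1.0, 2.0, 3.0, 5.0):
    print(f"d = {d:3.1f}  power(.05) = {power(d, 0.05):.3f}  "
          f"power(.01) = {power(d, 0.01):.3f}")
```

At d = 0 the power reduces to the level of significance itself, and for any fixed d the test at the .05 level is more powerful than the test at the .01 level, which is the trade between the two types of error described by Graybill.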
Using the t distribution the test
$H_0 : \mu = \mu_0$
against
$H_1 : \mu \ne \mu_0$
can be carried out.
First compute
$t = \frac{(\bar{X} - \mu_0)\, n^{1/2}}{s}.$
Reject the hypothesis at the $\alpha$ level of significance if
$|t| > t_{n-1,\,\alpha},$
where $t_{n-1,\,\alpha}$ is the tabulated two tailed value of Student's t
for n - 1 degrees of freedom.

This test is valuable to the geodesist attempting to determine
if a new set of observations is from the same population as
previous observations with a mean of $\bar{X}$ and a variance estimate $s^2$.
This problem often arises in the weighting of observations in an
adjustment.

The following problem will serve as an example. 

In his evaluation of the Laser Theodolite, Dunn (1966)
observed the horizontal angles between two targets at various current
levels. At 35 $\mu$ amps the angles obtained were (Dunn, p 52):
            Left             Right            Mean             Residual
         3° 07' 22".2     3° 06' 29".6     3° 06' 56".0        -5".0
         [illegible]         06  30 .4        06  57 .1        -3 .9
            07  00 .3        06  43 .6        06  52 .0        -9 .0
         [illegible]         06  39 .8        07  34 .5       +33 .5
            07  18 .4        06  33 .8        06  56 .1        -4 .9
         [illegible]      [illegible]         06  50 .4       -10 .6
Mean     3° 07' 26".8     3° 06' 35".3     3° 07' 01".0

[vv] = 1379.83        $s^2$ = 276.0        s = ±16".6


The t test may be used to compare the mean of one set with a
grand mean of previous sets.
From the previous two sets of observations the mean is

20 $\mu$ amps        3° 06' 51".1
30 $\mu$ amps        3° 06' 52".4
mean of 1 & 2        3° 06' 51".6

Test the hypothesis
$H_0$ : $\mu_{35}$ = 3° 06' 51".6
against
$H_1$ : $\mu_{35} \ne$ 3° 06' 51".6

t = (3° 07' 01".0 - 3° 06' 51".6)($\sqrt{6}$) / s = (9".4)(2.4) / s = 0.990

$t_{5,\,5\%} = 2.57.$

Since $t < t_{5,\,5\%}$, this hypothesis can be accepted at the 5%
significance level.
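
The summary values quoted under the table of observations at 35 $\mu$ amps can be checked from the residual column. The sketch below assumes the six residuals read -5.0, -3.9, -9.0, +33.5, -4.9 and -10.6 seconds of arc; the variable names are illustrative.

```python
from math import sqrt

# Residuals of the six set means, in seconds of arc (as read from the table;
# the reading of each entry is an assumption of this sketch).
residuals = [-5.0, -3.9, -9.0, 33.5, -4.9, -10.6]

n = len(residuals)
vv = sum(v * v for v in residuals)   # [vv], the sum of squared residuals
variance_estimate = vv / (n - 1)     # s squared, with n - 1 degrees of freedom
std_estimate = sqrt(variance_estimate)

print(f"[vv] = {vv:.2f}")
print(f"s^2  = {variance_estimate:.1f}")
print(f"s    = {std_estimate:.1f}")
```

The sum of squares and the variance estimate agree with the values printed under the table, which supports the reading of the residual column adopted here.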
2.4 Tests Involving Variance
The statistic $\chi^2$ was introduced by Pearson. It is defined by the
density function

(17) $\phi(\chi^2) = \frac{1}{2^{\nu/2}\, \Gamma(\nu/2)}\, (\chi^2)^{\nu/2 - 1}\, e^{-\chi^2/2}$

for $\chi^2 \ge 0$, otherwise $\phi(\chi^2) = 0$.

It is also known that the density function for the variance estimate
$s^2$ from a sample of size n of a population with a true variance $\sigma^2$ is
(Hamilton, 1964)

$\phi(s^2) = \frac{1}{\Gamma(\nu/2)} \left(\frac{\nu}{2\sigma^2}\right)^{\nu/2} (s^2)^{\nu/2 - 1} \exp\left[-\frac{\nu s^2}{2\sigma^2}\right]$

for $s^2 > 0$, otherwise $\phi(s^2) = 0$.

Setting $\chi^2 = \nu s^2/\sigma^2$, these two density functions are identical. The
value of $\nu s^2/\sigma^2$ is distributed as $\chi^2$.

The values of $\chi^2$ such that

$P(\chi^2 > \chi^2_{\nu,\,\alpha}) = \alpha$

are tabulated in Appendix III.

The $\chi^2$ statistic can be used to test whether the true variance
estimated by $s^2$ is equal to some variance $\sigma_0^2$. The hypothesis can be
stated

$H_0 : \sigma^2 = \sigma_0^2$
$H_1 : \sigma^2 \ne \sigma_0^2.$

To test, compute $\chi^2 = \frac{\nu s^2}{\sigma_0^2}.$

The hypothesis is rejected if

$\chi^2 < \chi^2_{\nu,\,1-\alpha/2}$

or

$\chi^2 > \chi^2_{\nu,\,\alpha/2}.$
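
The variance test can be sketched as follows. Everything numerical in this example is an assumption made for illustration: the variance estimate of 276.0 with 5 degrees of freedom is borrowed from the theodolite example above, the hypothesized variance of 100.0 is invented, and the two critical values are the standard tabulated values of $\chi^2$ for 5 degrees of freedom at the 97.5% and 2.5% upper tail points, supplied by hand as they would be from Appendix III.

```python
def chi_square_variance_test(s2, sigma0_2, nu, chi2_lower, chi2_upper):
    """Two tailed test of H0: sigma^2 = sigma0^2.

    chi2_lower and chi2_upper are the tabulated critical values for nu
    degrees of freedom; they are passed in rather than computed.
    """
    chi2 = nu * s2 / sigma0_2
    reject = chi2 < chi2_lower or chi2 > chi2_upper
    return chi2, reject


# Illustrative inputs (sigma0_2 is hypothetical).
chi2, reject = chi_square_variance_test(
    s2=276.0, sigma0_2=100.0, nu=5,
    chi2_lower=0.831,    # tabulated value, 5 degrees of freedom, 97.5% upper tail
    chi2_upper=12.833,   # tabulated value, 5 degrees of freedom, 2.5% upper tail
)

print(f"chi-square = {chi2:.2f}, reject H0: {reject}")
```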
The use of this statistic to test hypotheses concerning
geodetic and photogrammetric data is very limited. To compute
the statistic, $\sigma^2$, the true variance of the population, must be known.
This quantity is rarely known for the types of data analyzed in geodesy.

One application of the Chi-square test would be in triangulation.
The square of the desired standard error of the net could
be considered as the true variance, $\sigma^2$. The Chi-square statistic
computed at each station in the net, at the time of observation, could
then be used as a criterion for acceptance or rejection of the
observations.

The $\chi^2$ statistic can be used as a test of distribution. To
test the distribution, observations are grouped into n groups according
to their values. The expected number of observations in each
group is computed on the basis of the assumed distribution. Let $n_k$ be
the number of observations actually found in the kth group and $d_k$ be
the number predicted by the assumed distribution. The statistic

$\chi^2 = \sum_{k=1}^{n} \frac{(n_k - d_k)^2}{d_k}$

is distributed approximately as $\chi^2_{n-1}$ if each $d_k$ is at least 5
(Hamilton, 1964).
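
The statistic for the test of distribution is a simple sum over the groups. In the sketch below the observed and expected group counts are hypothetical, and the function name is an illustrative choice; the check on the expected counts follows the condition that each $d_k$ be at least 5.

```python
def chi_square_fit_statistic(observed, expected):
    """Sum of (n_k - d_k)^2 / d_k over the groups of a histogram."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if min(expected) < 5:
        raise ValueError("each expected count should be at least 5")
    return sum((n_k - d_k) ** 2 / d_k for n_k, d_k in zip(observed, expected))


# Hypothetical counts for 100 observations sorted into five groups.
observed = [8, 22, 41, 20, 9]
expected = [7.0, 24.0, 38.0, 24.0, 7.0]

statistic = chi_square_fit_statistic(observed, expected)
print(f"chi-square statistic = {statistic:.3f}")
```

The computed value would then be compared with the tabulated $\chi^2$ for the appropriate degrees of freedom.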


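2.5 The F Statistic


The F statistic or variance ratio, first introduced by Fisher,
is a useful statistic.
F is defined

(18) $F_{\nu_1,\,\nu_2} = \frac{Y_1/\nu_1}{Y_2/\nu_2}$

where $Y_1$ is distributed as $\chi^2$ with $\nu_1$ degrees of freedom, and $Y_2$,
INDEPENDENT of $Y_1$, is distributed as $\chi^2$ with $\nu_2$ degrees of freedom.

The probability density function is given as

$\phi(F) = \frac{\Gamma((\nu_1 + \nu_2)/2)}{\Gamma(\nu_1/2)\, \Gamma(\nu_2/2)} \left(\frac{\nu_1}{\nu_2}\right)^{\nu_1/2} F^{\nu_1/2 - 1} \left(1 + \frac{\nu_1 F}{\nu_2}\right)^{-(\nu_1 + \nu_2)/2}$

for $F > 0$, otherwise $\phi(F) = 0$.

It was shown previously that

$\frac{\nu s^2}{\sigma^2}$ is distributed as $\chi^2_\nu$.

For two samples from two normal populations

$\frac{s_1^2/\sigma_1^2}{s_2^2/\sigma_2^2}$ is distributed as $F_{\nu_1,\,\nu_2}$.

Thus the hypothesis that the samples were drawn from populations with
identical variances

$H_0 : \sigma_1^2 = \sigma_2^2$

can be tested against

$H_1 : \sigma_1^2 \ne \sigma_2^2;$

compute

$F = s_1^2 / s_2^2.$

The null hypothesis is rejected if the computed statistic falls
in the selected critical region at either end of the distribution

$F < F_{\nu_1,\,\nu_2,\,\alpha/2}$ or $F > F_{\nu_1,\,\nu_2,\,1-\alpha/2}$ (Dixon, 1957).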

The inequalities shown above depend upon the definition of $\alpha$.
In some literature they will appear reversed.

Tables of the percentiles of the F distribution can be found
in handbooks and statistics texts such as Dixon and Massey, Table A-7c.
Appendix IV shows a sample of such a table.

As an example of the F test, the data taken by Dunn (1966) and
Sprinsky can be analyzed.





Lt. Dunn made 34 observations of a horizontal angle with the
Laser Theodolite.

The values were

3° 06' 56".4
[the remaining 33 values are illegible in the source]

Mean of 34 observations        3° 06' 48".3
Standard deviation             ± 12".4
Sample variance                153.26





Using the same equipment Capt. Sprinsky obtained the following

5° 27' 19".9
   26  58 .4
   27  00 .0
[24 values are illegible in the source]
   26  42 .7

Mean of 28 observations        5° 26' 50".37
Standard deviation             ± 14".3
Sample variance                204.5

Were the true variances of these two sets of observations equal?
As a hypothesis this question can be stated
$H_0 : \sigma_1^2 = \sigma_2^2$
and can be tested against the alternative
$H_1 : \sigma_1^2 \ne \sigma_2^2.$
The test statistic is computed

$F = \frac{204.5}{153.26} = 1.33$
From a table of percentiles of the F distribution, such as the sample in
Appendix IV, the values of F are found to be

$F_{28,\,35,\,2.5\%}$ = [value illegible in the source]
$F_{28,\,35,\,97.5\%} = 2.01.$

Since
$F > F_{28,\,35,\,2.5\%}$
and
$F < F_{28,\,35,\,97.5\%}$
we can accept the hypothesis at the 5% significance level. It can then
be concluded that the observations of Lt. Dunn and Capt. Sprinsky have
the same true variance.
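
The arithmetic of this variance ratio test is brief enough to sketch. The two sample variances are taken as 204.5 and 153.26 (the second being the reading adopted here for the variance of the 34 observations), and the upper critical value of 2.01 is the tabulated value quoted in the text; the function name is hypothetical.

```python
def variance_ratio(s2_numerator, s2_denominator):
    """F statistic formed as the ratio of two variance estimates."""
    return s2_numerator / s2_denominator


s2_sprinsky = 204.5    # sample variance of the 28 observations
s2_dunn = 153.26       # sample variance of the 34 observations (assumed reading)
upper_critical = 2.01  # tabulated 97.5% value quoted in the text

f_stat = variance_ratio(s2_sprinsky, s2_dunn)
exceeds_upper = f_stat > upper_critical

print(f"F = {f_stat:.2f}, exceeds upper critical value: {exceeds_upper}")
```

Placing the larger variance in the numerator means that only the upper critical value needs to be consulted in practice.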

The following example will demonstrate another use of the F
test.

In his study of the Kern DKM 3 theodolite Abby (1965) measured
angles with a Wild T-3, and the DKM 3 using the center wire and the 5
wire field. These results were obtained:


Set   Instrument   Observations   Wires     s         $s^2$
1     T-3          16             1         0".99     0.980
2     T-3          16             1          .78       .608
3     DKM 3        15             5          .47       .221
4     DKM 3        10             5          .44       .194
5     DKM 3        10             1          .66       .436
6     DKM 3        30             5          .61       .372


The F statistic can be used to assist in drawing conclusions:
(a) Are the 5 wire observations significantly better than the
1 wire observations with the DKM 3? The hypothesis to be tested is
$H_0 : \sigma_1^2 = \sigma_2^2.$
For this example a one-tailed test will be used. The null hypothesis
can be rejected if

$\frac{s_1^2}{s_2^2} > F_{(1-\alpha),\,(n_1 - 1),\,(n_2 - 1)}.$

Comparing sets 3 and 5

$F = \frac{0.436}{0.221} = 1.973$

$F_{(14,\,9)\,5\%} = 3.00$
The difference is not significant at the 5% level.

Using sets 4 and 5

$F = \frac{0.436}{0.194} = 2.247$

$F_{(9,\,9)\,5\%} = 3.18$
The difference is not significant at the 5% level.
And using sets 5 and 6

$F = \frac{0.436}{0.372} = 1.172$

$F_{(15,\,29)\,5\%} = 2.03.$
Again the difference is not significant at the 5% level.
(b) Are the 5 wire, 30 observation sets significantly better
than the 5 wire 10 observation sets?

$F = \frac{0.372}{0.194} = 1.918$

$F_{(29,\,9)\,5\%}$ = [value illegible in the source]
The difference is not significant.
(c) Is the DKM 3 significantly better than set 1 with the
T-3?

$F = \frac{.980}{.436} = 2.247$

$F_{(15,\,15)\,5\%}$ = [value illegible in the source]
Again the test shows that the difference is not significant.
(d) Is the DKM 3, 5 wire procedure significantly better than
the T-3?

$F = \frac{0.980}{0.221} = 4.43$ and $F = \frac{.608}{.221} = 2.75$

[critical value illegible in the source]
The test shows that the DKM 3, 5 wire procedure is significantly better
than the T-3.

This problem also shows that statistical tests are not infallible.
From parts (a) and (b) one can conclude that the difference
in procedure is not significant. Part (c) draws the conclusion
that the 1 wire DKM 3 procedure and the T-3 are not significantly
different, yet (d) concludes that the 5 wire procedure
is significantly different.

Statistical inference must be used with caution. Judgement
must also be used in drawing conclusions.
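
The variance ratios used in parts (a) through (c) can be recomputed from the table of results. In the sketch below the six variances are entered as read from the table; the value 0.221 for set 3 is an assumption based on its standard deviation of 0".47, and the dictionary and function names are illustrative.

```python
# Variance estimates by set number, as read from the table of results
# (the value for set 3 is an assumed reading, 0.47 squared rounded).
variances = {1: 0.980, 2: 0.608, 3: 0.221, 4: 0.194, 5: 0.436, 6: 0.372}


def f_ratio(numerator_set, denominator_set):
    """Variance ratio F formed from two numbered sets."""
    return variances[numerator_set] / variances[denominator_set]


comparisons = {
    "(a) set 5 over set 3": f_ratio(5, 3),
    "(a) set 5 over set 4": f_ratio(5, 4),
    "(a) set 5 over set 6": f_ratio(5, 6),
    "(b) set 6 over set 4": f_ratio(6, 4),
    "(c) set 1 over set 5": f_ratio(1, 5),
}

for label, value in comparisons.items():
    print(f"{label}: F = {value:.3f}")
```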


2.6 Testing Correlations
The distribution of the sample correlation coefficient, r,
is very complicated. Often in geodesy it is valuable to know if
the true correlation coefficient, $\rho$, is zero. If $\rho$ is equal to
zero the statistic

$t = \frac{r\,(n - 2)^{1/2}}{(1 - r^2)^{1/2}}$

is distributed as Student's t with (n-2) degrees of freedom (Guttman,
1965).
To test the hypothesis

$H_0 : \rho = 0$

against

$H_1 : \rho \ne 0$

compute the statistic

$t = \frac{r\,(n - 2)^{1/2}}{(1 - r^2)^{1/2}}.$

If $|t| > t_{n-2,\,\alpha/2}$, the hypothesis may be rejected in favor of the
alternate.
To test the hypothesis
$H_0 : \rho = \rho_0$, where $\rho_0 \ne 0$,
we must know the distribution of r when $\rho \ne 0$. In most geodetic
applications it is more important to know if correlation exists,
rather than if it might be some specific value. For a discussion
of the distribution of r when $\rho \ne 0$ the reader is referred to
Graybill (1961).
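
The statistic for testing a zero correlation can be sketched as below. The form used is the standard one, t = r (n - 2)^(1/2) / (1 - r^2)^(1/2); the sample values r = 0.5 and n = 27 are hypothetical, as is the function name.

```python
from math import sqrt


def correlation_t(r, n):
    """Student's t statistic for testing H0: rho = 0 from a sample correlation r."""
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    if n <= 2:
        raise ValueError("at least three pairs of observations are needed")
    return r * sqrt(n - 2) / sqrt(1.0 - r * r)


# Hypothetical sample: correlation of 0.5 from 27 pairs of observations.
t = correlation_t(0.5, 27)
print(f"t = {t:.3f} with {27 - 2} degrees of freedom")
```

The value obtained is compared with the tabulated Student's t for n - 2 degrees of freedom.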





CHAPTER 3


TESTS INVOLVING MULTIPLE VARIABLES
So far the tests which have been discussed deal only with one
variable. The techniques of Regression Analysis allow the testing
of hypotheses involving two or more related variables.


3.1 Regression Analysis


The functional relationship between some observable L and
variables $x_i$ can be expressed mathematically

(19) $L = f(x_1, x_2, \ldots, x_n;\ \theta_1, \theta_2, \ldots, \theta_m)$

where $\theta_i$ is a parameter in the function. This equation is often
abbreviated

$L = f(x_1, x_2, \ldots, x_n).$

To the statistician this equation is known as a regression
function. The geodesist and photogrammetrist will recognize it as
the mathematical structure for a method of observation equations
adjustment problem.

The mathematical structure of the problem may be chosen
by two methods. In geodesy the analytical consideration of the
phenomenon involved is the preferred method. The examination of
scatter diagrams plotted from the observed data can also yield a
workable structure.


3.2 Fitting a Linear Mathematical Structure by Least Squares


Suppose that a linear relationship exists between a dependent
variable, y, and an independent variable, x, such as the relationship
between gravity and height in free air. From observations n
sets of points, $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, are obtained. The
sample structure can be expressed

(20) $y = \beta_0 + \beta_1 x.$

From the observed data we wish to obtain the values of the unknown
parameters $\beta_0$ and $\beta_1$.

Let us assume that y for any given value of x is a random variable
which is normally distributed with a mean $\beta_0 + \beta_1 x$ and variance $\sigma^2$.
We will also assume that $\beta_0$ and $\beta_1$ do not depend upon x. The
conditional probability density function of y for a given x can be written

$\phi(y|x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(y - \beta_0 - \beta_1 x)^2}{2\sigma^2}\right].$

The expected value of y, given x, is

(21) $E(y|x) = \beta_0 + \beta_1 x.$

The conditional probability density function³ of a set of n
observations $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ is

(22) $\prod_{i=1}^{n} \phi(y_i|x_i) = \prod_{i=1}^{n} \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(y_i - \beta_0 - \beta_1 x_i)^2}{2\sigma^2}\right] = \left(\frac{1}{\sigma\sqrt{2\pi}}\right)^{n} \exp\left[-\frac{1}{2\sigma^2} \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2\right].$

Applying the principles of maximum likelihood (Guttman, 1965) the
parameters which maximize the probability density function must be
found. To maximize this function the quantity

(23) $F(\beta_0, \beta_1) = \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2$

must be minimized with respect to $\beta_0$ and $\beta_1$. The process of
estimating $\beta_0$ and $\beta_1$ by minimizing the sum of the squares of the residuals
is the well known method of least squares. The values of $\beta_0$ and $\beta_1$
which minimize $F(\beta_0, \beta_1)$ are those for which

(24) $\frac{\partial F(\beta_0, \beta_1)}{\partial \beta_0} = 0, \qquad \frac{\partial F(\beta_0, \beta_1)}{\partial \beta_1} = 0.$

³For proof see Guttman (1965), Appendix III.

It should be noted that for a normal distribution, the method of
least squares gives a maximum likelihood estimate of the parameters.
If the error in the observed quantities is not normally distributed,
the method of least squares will give a minimum variance estimate of the
parameters but this estimate will not be the maximum likelihood estimate.
Taking the derivatives of (23) we obtain

(25) $-2 \sum (y_i - \beta_0 - \beta_1 x_i) = 0$
     $-2 \sum x_i (y_i - \beta_0 - \beta_1 x_i) = 0.$

These equations can be rewritten as

$\sum y_i = n\beta_0 + \beta_1 \sum x_i$
$\sum x_i y_i = \beta_0 \sum x_i + \beta_1 \sum x_i^2.$

In this form they are referred to as the normal equations. In matrix
form this equation is the formula

$NX + U = 0.$

If the determinant of N is not zero a solution to this equation exists.
The values are

$b_0 = \bar{y} - b_1 \bar{x}$

$b_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$

where $b_0$ and $b_1$ are estimates of $\beta_0$ and $\beta_1$ (Guttman, 1965).
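
The closed form solution of the normal equations can be sketched directly. The four data points below are hypothetical and the function name is an illustrative choice; the final lines confirm that the estimates satisfy the two conditions of equation (25).

```python
from statistics import fmean


def fit_line(xs, ys):
    """Least squares estimates b0 (intercept) and b1 (slope) of y = b0 + b1 x."""
    x_bar, y_bar = fmean(xs), fmean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    if s_xx == 0:
        raise ValueError("all x values are equal; the slope is undetermined")
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical observations (for illustration only).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]
b0, b1 = fit_line(xs, ys)

# The residuals must satisfy both normal equations.
residuals = [y - b0 - b1 * x for x, y in zip(xs, ys)]
sum_v = sum(residuals)
sum_xv = sum(x * v for x, v in zip(xs, residuals))

print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
print(f"sum of residuals = {sum_v:.2e}, sum of x times residuals = {sum_xv:.2e}")
```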
Graphically all values of x and y could be represented as a
straight line in the x, y plane with intercept $b_0$ and slope $b_1$. This
line is called the regression line.

The unknown parameters can be found using the matrix equation

$NX = -U$

or $X = -N^{-1}U.$

The variance of unit weight, $s_0^2$ (or $m_0^2$, as frequently used in
geodetic literature), is given by

(28) $s_0^2 = \frac{V'PV}{n - u}.$

In the linear regression problem u is 2. From the matrix adjustment
we also form the weight coefficient matrix, $Q_x$. The variance-covariance
matrix $\Sigma_x$ is formed by multiplying $Q_x$ by $s_0^2$. In the linear
regression

(29) $\Sigma_x = \begin{bmatrix} s_{b_0}^2 & s_{b_0 b_1} \\ s_{b_0 b_1} & s_{b_1}^2 \end{bmatrix}.$

These quantities are the statistics which will be used in the testing
of hypotheses concerning this regression.
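
For the two parameter case the quantities in (28) and (29) can be formed without general matrix routines. The sketch below assumes unit weights (P taken as the identity matrix), so that N is the 2 by 2 matrix of the normal equations and $Q_x$ is its inverse; the data are hypothetical, and the values of $b_0$ and $b_1$ are the least squares estimates for those same points.

```python
def regression_covariance(xs, ys, b0, b1):
    """Variance of unit weight and variance-covariance matrix of (b0, b1).

    Assumes unit weights, so N = [[n, sum x], [sum x, sum x^2]] and the
    weight coefficient matrix Q is the inverse of N.
    """
    n = len(xs)
    u = 2  # number of unknown parameters in the linear regression
    residuals = [y - b0 - b1 * x for x, y in zip(xs, ys)]
    s0_sq = sum(v * v for v in residuals) / (n - u)

    sum_x = sum(xs)
    sum_xx = sum(x * x for x in xs)
    det = n * sum_xx - sum_x ** 2
    if det == 0:
        raise ValueError("normal equation matrix is singular")
    q = [[sum_xx / det, -sum_x / det],
         [-sum_x / det, n / det]]
    covariance = [[s0_sq * element for element in row] for row in q]
    return s0_sq, covariance


# Hypothetical observations and their least squares estimates.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]
b0, b1 = 1.09, 1.94

s0_sq, covariance = regression_covariance(xs, ys, b0, b1)
print(f"variance of unit weight = {s0_sq:.4f}")
print(f"variance of b0 = {covariance[0][0]:.4f}")
print(f"variance of b1 = {covariance[1][1]:.4f}")
print(f"covariance of b0 and b1 = {covariance[0][1]:.4f}")
```

The square roots of the diagonal terms are the standard errors of the intercept and slope used in the tests of the regression.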


3.3 Tests of Hypotheses in a Linear Regression


In most linear regressions the estimator of greatest importance
is the slope of the line, $b_1$. To test if the slope is significantly
different from some hypothesized value, say $\beta_1'$, the t statistic is
used. Stating the hypothesis

$H_0 : \beta_1 = \beta_1'$

the alternate is

$H_1 : \beta_1 \ne \beta_1'$

the statistic computed is

(30) $t = \frac{b_1 - \beta_1'}{s_{b_1}}.$

The hypothesis is rejected if

$t > t_{(1-\alpha/2),\,(n-2)}$
or
$t < -t_{(1-\alpha/2),\,(n-2)}.$

It should be noted that if y is independent of x, $\beta_1$ will be 0. We
can test for this by setting $\beta_1'$ equal to 0, then

(31) $t = \frac{b_1 - 0}{s_{b_1}}$

would be the test statistic.

Other hypotheses concerning the linear regression are

(1) $H_0 : \beta_0 = \beta_0'.$
(2) $H_0 : \beta_1 = \beta_1'$ and $\beta_0 = \beta_0'.$

For hypothesis (1) the t statistic is used

(32) $t = \frac{b_0 - \beta_0'}{s_{b_0}}.$

The F statistic can be used to simultaneously test hypothesis
(2). This method will be discussed in section 3.31.





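Table 2


Summary of Test Procedures


Hypothesis                                     Statistic   Equation           Rejection Region

$\beta_1 = \beta_1'$                           t           30                 $|t| > t_{(n-2),\,(1-\alpha/2)}$

$\beta_0 = \beta_0'$                           t           32                 $|t| > t_{(n-2),\,(1-\alpha/2)}$

$\beta_1 = \beta_1'$ and $\beta_0 = \beta_0'$  F           34 (sec. 3.31)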





The mathematical structure discussed is quite general. For
example
$y = \beta_0 + \beta_1 \sin t$
can be handled by the methods previously discussed simply by
substituting
$x = \sin t.$

The exponential problem $u = \alpha e^{\beta v}$ reduces to a linear form

$\log u = \log \alpha + \beta v.$

Making the following substitutions

$y = \log u$
$\beta_0 = \log \alpha$
$x = v$
$\beta_1 = \beta$

we have the problem in the form
$y = \beta_0 + \beta_1 x$ (Ostle, 1963).
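
The reduction of the exponential problem to linear form can be sketched as follows. The data are synthetic, generated from assumed values $\alpha$ = 2.0 and $\beta$ = 0.5 so that the recovered parameters can be checked; natural logarithms are used, and the function name is hypothetical.

```python
from math import exp, log
from statistics import fmean


def fit_exponential(vs, us):
    """Estimate alpha and beta in u = alpha * exp(beta * v) by a linear fit.

    The substitution y = log u, x = v turns the problem into
    y = b0 + b1 x with b0 = log alpha and b1 = beta.
    """
    xs = list(vs)
    ys = [log(u) for u in us]
    x_bar, y_bar = fmean(xs), fmean(ys)
    b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
          / sum((x - x_bar) ** 2 for x in xs))
    b0 = y_bar - b1 * x_bar
    return exp(b0), b1


# Synthetic, error free data from assumed parameters alpha = 2.0, beta = 0.5.
vs = [0.0, 1.0, 2.0, 3.0, 4.0]
us = [2.0 * exp(0.5 * v) for v in vs]

alpha, beta = fit_exponential(vs, us)
print(f"alpha = {alpha:.4f}, beta = {beta:.4f}")
```

It should be kept in mind that the fit minimizes the squared residuals of log u rather than of u, so with real observations the weighting differs from a direct adjustment of the exponential form.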


As one example of the geodetic applications of a linear
regression, data presented by Laurila (1965) can be analyzed. In this
report, Dr. Laurila assumes that the relationship between absolute
humidity and measuring error is linear. The mathematical structure used
is

$D_t - D_0 = CA + K$

where $D_t$ is the observed distance, $D_0$ the true distance of the reference
base, A denotes the absolute humidity, C represents the humidity
coefficient, and K is an instrument constant.

From a least squares adjustment of 25 measurements the following
results were obtained (Laurila, 1965).

Figure 6 is a scatter diagram plotted from the data given in the 
report, for the reverse readings. 

Using the t test, hypotheses about this line can be tested, It 
should be kept in mind that these tests are conditional tests, They 


take into account only the effect of errors of the parameter tested, 





[Figure 6. Scatter diagram of measuring error against absolute humidity, reverse readings (Laurila, 1965).]





(1) $H_0 : C = 0$

    $$t = \frac{0.92}{s_C} = 9.7$$

    $t_{24,\,5\%} = 2.06$

This hypothesis can be rejected.

(2) $H_0 : K = 0$

    $$t = \frac{11.4}{1.2} = 9.5$$

    $t_{24,\,5\%} = 2.06$

This hypothesis can also be rejected.

From a combined adjustment (Laurila, 1965) the parameters were
found to be

    C = 0.66
    K = -6.8.

Were these the true parameters for the line given by the forward readings
only?

(3) $H_0 : C = 0.66$

    $$t = \frac{0.92 - 0.66}{0.10} = \frac{0.26}{0.10} = 2.6$$

    $t_{24,\,5\%} = 2.06$

This hypothesis may also be rejected.

(4) $H_0 : K = 6.8$

    $$t = \frac{11.4 - 6.8}{1.2} = \frac{4.6}{1.2} = 3.83$$

Again this hypothesis may be rejected.
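The arithmetic of tests (2) and (4) can be checked with a few lines of code. This is a sketch only: the values K = 11.4 and a standard error of 1.2 are those read from the tests above, the critical value 2.06 is the tabulated one quoted there, and the function name `t_statistic` is hypothetical.

```python
def t_statistic(estimate, hypothesized, std_error):
    """t = (estimate - hypothesized value) / standard error of the estimate."""
    return (estimate - hypothesized) / std_error


T_CRIT = 2.06            # tabulated two-sided 5% value quoted in the text
k_hat, s_k = 11.4, 1.2   # instrument constant and its standard error, as read above

t_zero = t_statistic(k_hat, 0.0, s_k)      # test (2): K = 0
t_combined = t_statistic(k_hat, 6.8, s_k)  # test (4): K = 6.8

reject_zero = abs(t_zero) >= T_CRIT
reject_combined = abs(t_combined) >= T_CRIT
```

Both statistics exceed the critical value, which agrees with the rejections stated above.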
In a similar manner the linear coefficients obtained from a
least squares adjustment may be tested for a zero value or some
theoretical or previously determined value.

3.31 Matrix Approach

The simplest way to handle adjustment computations is to use
matrix algebra. It would be valuable to have matrix methods to test
statistical hypotheses concerning the results of an adjustment. Hamilton
(1964) has given such methods.

From an adjustment computation, using the methods outlined by
Dr. Uotila (1966), the variance-covariance matrix $\Sigma_x$, the solution matrix
X, and the estimated standard error of one observation of unit weight,
$s_0$, can be obtained. From the derivation

$$\Sigma_x = s_0^2 Q_x = s_0^2 N^{-1}$$

Hamilton (1964) shows that

(33)  $$\frac{(X - X_0)'\,Q_x^{-1}\,(X - X_0)}{u\,s_0^2}$$

where $X_0$ denotes the true value of X, is distributed as

$$\frac{\chi^2_u / u}{\chi^2_{n-u} / (n-u)} = F_{u,\,n-u}$$

by matrix algebra.


Applying Hamilton's test

$$H_0 : X = X_H$$

where $X_H$ is the hypothesized value of X, compute

(34)  $$\frac{S_u}{u} = \frac{(X - X_H)'\,Q_x^{-1}\,(X - X_H)}{u\,s_0^2}$$

If $S_u / u$ exceeds $F_{u,\,n-u,\,\alpha}$ the hypothesis, $H_0$, may be rejected at the $\alpha$
level of significance.
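Equation (34) amounts to a single quadratic form. The sketch below assumes that the normal matrix N (equal to the inverse of the weight coefficient matrix) and the variance of unit weight are available from the adjustment; the function name `simultaneous_f` and the two-parameter numbers are invented for illustration.

```python
def simultaneous_f(x_est, x_hyp, n_matrix, s0_sq):
    """Return (X - X_H)' N (X - X_H) / (u * s0^2), with N = Q^-1."""
    d = [a - b for a, b in zip(x_est, x_hyp)]   # X - X_H
    u = len(d)
    quad = sum(d[i] * n_matrix[i][j] * d[j]
               for i in range(u) for j in range(u))
    return quad / (u * s0_sq)


# Hypothetical two-parameter adjustment.
stat = simultaneous_f(
    x_est=[1.2, 0.8],
    x_hyp=[1.0, 1.0],
    n_matrix=[[4.0, 1.0], [1.0, 3.0]],
    s0_sq=0.5,
)
```

The value of `stat` would then be compared with the tabulated F for u and n - u degrees of freedom at the chosen level of significance.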
Hamilton (1964) also develops the theory needed to test
hypotheses when constraints have been placed on the structure.
Let the constraints be $CX = Z$. Introducing Lagrange multipliers, K, the function

$$\phi = V'PV - 2K'(CX - Z)$$

is obtained. Differentiating this expression

$$d\phi = 2V'P\,dV - 2K'C\,dX.$$

Let $\hat{X}$ denote the least squares estimate under the constraint.
Minimizing the residuals

$$d\phi = 2\,[-U' + \hat{X}'N - K'C]\,dX = 0$$

so

$$K'C = -U' + \hat{X}'N$$

Substituting

$$U' = X^{*\prime}N$$

where $X^*$ is the best least squares solution without conditions, the equation
becomes

$$K'C = (\hat{X} - X^*)'N$$

$$K'CN^{-1}C' = (\hat{X} - X^*)'C' = (Z - CX^*)'$$

thus

$$K' = (Z - CX^*)'\,(CN^{-1}C')^{-1}$$

Eliminating K from the preceding two equations we obtain

$$(\hat{X} - X^*)' = (Z - CX^*)'\,(CN^{-1}C')^{-1}\,CN^{-1}$$

or

(35)  $$\hat{X}' = X^{*\prime} + (Z - CX^*)'\,(CN^{-1}C')^{-1}\,CN^{-1}$$

The weighted sum of the squares of the residuals is given by

$$R_1 = V'PV + (\hat{X} - X^*)'\,(A'PA)\,(\hat{X} - X^*)$$

V is the matrix of residuals without the constraints.
The expected value of $R_1$ is $(n - u + b)\,\sigma^2$, where b is the number of
conditions. The expression may be rewritten as

$$R_1 = R_0 + R_2$$
where

$$R_0 = V'PV$$

That is, $R_0$ is the unconditional least squares sum and $R_2$ is the additional
sum of squares due to the constraints. Hamilton (1964) shows
that

$$R_2 = (Z - CX^*)'\,[C(A'PA)^{-1}C']^{-1}\,(Z - CX^*).$$

The ratio

$$\frac{R_2}{R_0} = \frac{R_1 - R_0}{R_0}$$

is distributed as

$$\frac{b}{n-u}\,F_{b,\,n-u}$$

Substituting the values of $R_2$ and $R_0$

$$R_2 = (Z - CX^*)'\,[C(A'PA)^{-1}C']^{-1}\,(Z - CX^*), \qquad R_0 = (n-u)\,s_0^2$$

but

$$s_0^2\,(A'PA)^{-1} = \Sigma_x$$

the variance-covariance matrix.

Thus

(36)  $$\frac{n-u}{b}\,\frac{R_2}{R_0} = \frac{(Z - CX^*)'\,(C\,\Sigma_x\,C')^{-1}\,(Z - CX^*)}{b}$$

which is distributed as $F_{b,\,n-u}$. The hypothesis can be rejected if
the computed value of this quantity exceeds the tabulated value of
$F_{b,\,n-u,\,\alpha}$.

If a theoretical model has parameters $X^T$, the hypothesis to be tested is

$$H_0 : X = X^T$$

that is

$$x_1 = x_1^T,\quad x_2 = x_2^T,\quad \ldots,\quad x_u = x_u^T$$

The matrices C and Z are

$${}_uC_u = I \quad \text{and} \quad {}_uZ_1 = X^T$$

thus the statistic to be tested is

$$\frac{(X^* - X^T)'\,Q_x^{-1}\,(X^* - X^T)}{u\,s_0^2} = \frac{S_u}{u}$$

This is a more general derivation of the statistic offered earlier.

To test a hypothesis concerning the value of a single parameter,
regardless of the values which the other parameters have, the same
procedure would be followed.

$$H_0 : x_1 = x_1^T$$

$$C = (1, 0, 0, \ldots, 0)$$

$$Z = x_1^T$$

Then if the hypothesis is true

$$(n-u)\,\frac{R_2}{R_0} = \frac{(x_1 - x_1^T)^2}{s_1^2}$$

where $s_1^2$ is the variance of $x_1$, and the statistic $(n-u)\,R_2/R_0$
is distributed as $F_{1,\,n-u}$. This is simply the square of the
Student's t for (n - u) degrees of freedom.
For a more detailed discussion of the matrix approach the reader
is referred to chapter 4 of Hamilton (1964).
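The constrained test of equation (36), and its reduction to the square of Student's t for a single parameter, can be sketched as follows. The code assumes the weight coefficient matrix Q (the inverse of N) is already known; the function names `solve` and `constrained_f` and all numbers are hypothetical illustrations, not values from Hamilton (1964).

```python
def solve(a, rhs):
    """Solve a @ y = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [rhs[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def constrained_f(c, z, x_star, q, s0_sq):
    """(Z - C X*)' (C Sigma C')^-1 (Z - C X*) / b, with Sigma = s0^2 * Q."""
    b, u = len(c), len(x_star)
    d = [z[i] - sum(c[i][k] * x_star[k] for k in range(u)) for i in range(b)]
    cq = [[sum(c[i][k] * q[k][j] for k in range(u)) for j in range(u)]
          for i in range(b)]
    cqc = [[sum(cq[i][k] * c[j][k] for k in range(u)) for j in range(b)]
           for i in range(b)]                             # C Q C'
    y = solve(cqc, d)                                     # (C Q C')^-1 d
    return sum(di * yi for di, yi in zip(d, y)) / (b * s0_sq)


# Hypothetical adjustment with u = 2; test the single hypothesis x1 = 0.
q = [[0.25, 0.10], [0.10, 0.50]]
x_star, s0_sq = [3.0, 1.0], 4.0
f_single = constrained_f([[1.0, 0.0]], [0.0], x_star, q, s0_sq)
t_single = (x_star[0] - 0.0) / (s0_sq * q[0][0]) ** 0.5   # ordinary t statistic
```

For this single constraint `f_single` equals the square of `t_single`, which is the relation stated above between F with 1 and n - u degrees of freedom and Student's t.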
To illustrate the matrix approach, the data of Hatch (1964)
will be used.

        | 15.00   0.00  -0.10  -0.01        0.01  -0.20        -0.03 |
        |  0.00   7.66   0.13  -0.20       -0.01   0.02         0.03 |
        | -0.10   0.13   7.42   0.05        0.00   0.02         0.01 |
    N = | -0.01  -0.20   0.05  [illegible]  0.00   0.03         0.00 |
        |  0.01  -0.01   0.00   0.00        0.03   0.00         0.00 |
        | -0.20   0.02   0.02   0.03        0.00  [illegible]   0.00 |
        | -0.03   0.03   0.01   0.00        0.00   0.00         0.27 |

Other data are

    s_0   = 2.44
    s_0^2 = 5.95
    u     = 7
    n     = 15

    X* = (4.05, -2.46, [illegible], 1.74, 1.00, -5.30, -0.12)'

For the sake of an example, assume that some theoretical considerations
predict $X^T$ to be:





We wish to test the hypothesis

$$H_0 : X = X^T$$

The computed values are

    (X* - X^T) = (0.39, -2.46, [illegible], [illegible], 1.00, -5.66, -0.12)'

$$\frac{S_u}{u} = \frac{(X^* - X^T)'\,N\,(X^* - X^T)}{u\,s_0^2} = \frac{158.79}{41.65} = 3.81$$

From the tables for F (Appendix IV)

$$F_{7,\,8,\,5\%} = 3.50$$

Since $S_u / u \geq F_{7,\,8,\,5\%}$, the hypothesis may be rejected at
the 5% significance level. Rejection at this level of significance is
termed significant. Can the hypothesis also be rejected at the 1%
level?

$$F_{7,\,8,\,1\%} = 6.18$$

Since $S_u / u \leq F_{7,\,8,\,1\%}$, the hypothesis may not be rejected at the 1%
significance level.
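The two-level decision just made can be written as a small rule. This sketch uses the quadratic form 158.79, u = 7 and the variance of unit weight 5.95 quoted in the example, together with the standard tabulated F values for 7 and 8 degrees of freedom (3.50 at 5%, 6.18 at 1%); the function name `classify` is hypothetical.

```python
def classify(statistic, crit_5, crit_1):
    """Label a test statistic using the 5% and 1% tabulated values."""
    if statistic >= crit_1:
        return "highly significant"
    if statistic >= crit_5:
        return "significant"
    return "not significant"


u, s0_sq = 7, 5.95
stat = 158.79 / (u * s0_sq)          # quadratic form divided by u * s0^2
result = classify(stat, crit_5=3.50, crit_1=6.18)
```

Here `result` is "significant": the statistic lies between the 5% and 1% tabulated values, as found above.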


3.4 Complex Models 


In many cases a simple two dimensional linear mathematical
structure will not properly represent the given data. It may be that
a polynomial of increasing order

$$Y = \beta_0 + \beta_1 x + \beta_2 x^2 + \cdots + \beta_j x^j$$

will better represent the true structure. The matrix least squares
solution is identical in procedure to the previous case. Using the
solution matrix $(b_0, b_1, \ldots, b_j)$ and the weight coefficient matrix (Q),
statistical hypotheses may be tested using the t tests previously shown.

The hypothesis

$$H_0 : \beta_i = \beta_i', \qquad i = 0, 1, \ldots, j$$

is tested by computing

$$t = \frac{b_i - \beta_i'}{s_0\,\sqrt{q_{i+1,\,i+1}}}$$

and tested against

$$t_{(1-\alpha/2),\,(n-u)}.$$

For a hypothesized value of zero the hypothesis may also be tested by

$$F = \frac{b_i^2}{s_0^2\,q_{i+1,\,i+1}}$$


These F tests serve to assess the significance of the additional
reduction in the residual sum of squares achieved by fitting b's in the
particular order adopted. The order of fit is important. In a
polynomial this order is fixed. In other structures the order is
determined by the equations of the mathematical structure.

When fitting polynomials Ostle (1963) recommends,

"Rather than seek a better fit in terms of a
higher degree polynomial (i.e. a degree greater
than 2) it is probably better to cast about for
some other functional form to represent the data."

An excellent example of regression analysis applied to a gravity
problem is given in the article by H. Wolf (1965). In this paper Dr.
Wolf uses hypothesis tests to determine the systematic trend of gravity
differences along the European Calibration Line.

It is evident that the matrix approach discussed in Section
3.31 can be easily applied to complex models.

Application of hypothesis tests to a non-linear model raises
another interesting problem. It cannot be shown that the non-linear
least-squares solution will always converge to even a local minimum
value of the weighted sum of the squares of the residuals, V'PV. This
problem is discussed by Hamilton (1964). He concludes,

"If the estimated errors are small enough that
the functions are truly linear over the range
of several standard deviations in each parameter,
the methods of testing linear hypothesis ...
can be applied in the same way."

When individual parameters are tested, either by the t or F
tests previously demonstrated, any covariance between the parameters
is not considered. If the parameters were truly independent, the choice
between individual tests, i.e., the t test, and the simultaneous
matrix tests would be one of individual preference.

If the covariance between parameters is large, the simultaneous
matrix test should be used.





Example:

In his work with the MRA-1 Tellurometer, Hatch (1964)
investigated the mathematical structure

$$E = b_1 + b_2 \sin\frac{2\pi (A + b_5)}{100} + b_3 \sin\frac{4\pi (A + b_6)}{100} + b_4 \sin\frac{6\pi (A + b_7)}{100}.$$

After least squares adjustment, results for January 19 were:

    b(1) = [illegible] ± [illegible]
    b(2) =   0.32 ±  1.83
    b(3) =  10.09 ±  1.92
    b(4) =  -2.57 ±  1.80
    b(5) =  -0.31 ± 89.63
    b(6) =  23.02 ±  1.45
    b(7) =  -0.06 ± [illegible]

He made 16 observations; thus there are

    (n - u) = 16 - 7 = 9

degrees of freedom.

To test

$$H_0 : \beta_i = 0$$

$t_i$ is computed from

$$t_i = \frac{b_i}{s_{b_i}}$$

that is,

    t_1 = [illegible]
    t_2 =  0.32 /  1.83 =  0.175
    t_3 = 10.09 /  1.92 =  5.25
    t_4 =  2.57 /  1.80 =  1.43
    t_5 =  0.31 / 89.63 =  0.0035
    t_6 = 23.02 /  1.45 = 15.90
    t_7 =  0.060 / [illegible] = 0.0206

From the table

    t_{9, 5%} = 2.26

thus the only parameters which are significant at the 5% level are $b_3$
and $b_6$.

At the 1% level

    t_{9, 1%} = 3.25

therefore $b_1$, $b_3$, and $b_6$ are significant at this level.
These tests at a significance level of 1% give statistical support
to the conclusions drawn by Hatch (1964) that the error can be
represented by

$$E = b_1 + b_3 \sin\frac{4\pi (A + b_6)}{100}.$$

To go a step farther with the data given by Hatch, the hypothesis

$$H_0 : \beta_i = 0, \qquad i = 1 \text{ to } 7$$

was tested on each set of observations using the t test. Table 3 shows
the results (Hatch, 1964).

Table 3

Test of $H_0 : \beta_i = 0$

                     Date*                   Number of
Parameter    ...     March     April         Rejections

$\beta_1$                      R                 3
$\beta_2$    A       R         R                 3
$\beta_3$    R       A         R                 4
$\beta_4$    A       A         A                 1
$\beta_5$    A       A         A                 0
$\beta_6$    A       A         R                 2
$\beta_7$    A       A         A                 1

* A signifies acceptance, R signifies rejection.

If the parameters which rejected the hypothesis more than once
in the six sets are used, the formula for the cyclic zero error would be

$$E = b_1 + b_2 \sin\frac{2\pi A}{100} + b_3 \sin\frac{4\pi (A + b_6)}{100}.$$


As a comparison of the individual t test and the simultaneous
matrix test consider the following data taken on 23 March by Hatch
(1964).

    n = 25,  u = 7,  s_0 = ± 3.03,  s_0^2 = 9.19

    B(1) =  3.66 ±  0.65
    B(2) = -1.67 ±  0.87
    B(3) =  6.83 ±  0.93
    B(4) =  1.75 ±  0.88
    B(5) = [illegible] ± 15.39
    B(6) = [illegible] ±  1.32
    B(7) =  1.63 ±  4.79




25.00 -2.95 -0.26 Peo. se0.27 1 e000 0nd) 
=2.95 43,972 U2ee23 Geeme2 1.60 0.17 
=0,26 S22, 28 Se 1,62 ath ee Ono een O, dic 0.14 
fee (e258 «22,23 a8 13.05 one 02 eeeo. 14 
-0.27 -0.02 0.01 0.01 0,04 -0,08 -0.01 
ene. 1.60 6.48 SHikoe 20-0smarc. 160) 820,25 
Wont 0.17 OAs oe some = 5 0.42 | 
The inverse matrix is given as: 
0.05 Oot 0,00 e0nOlN 0,32 990,02 =0,00 
0.04 0,08 0,02 ORO O05 meG.02  -0200 
0,00 0,02 0702 ORO Mm= 0701 O Oley 0.04. 
n~l= | -0.01 0.01 0.01 0,03 0,00 0.02 0.03 
0,32 0.05 -0.01 0,00 25.89 0,45 0.72 
0,02 "%=0,02 §-0.01 DR0r02 on sce) 0.14 
0.00 -0.05 -0.04 0.03 0.72 0.14 2,51. 


From the previous t tests one would expect that B(4), B(5),
and B(7) might be equal to zero. Constraining all values to be equal
to their test value, the C matrix is equal to I, and Z is equal to the
test matrix, $X^T$. The hypothesis

$$H_0 : X = X^T$$

where

    X^T = (3.7, -1.7, 6.8, 0.0, 0.0, 0.4, 0.0)'

will be tested. Using equation (36)

$$\frac{S_u}{u} = \frac{(X^* - X^T)'\,N\,(X^* - X^T)}{u\,s_0^2} = \frac{44.697}{(7)(9.19)} = 0.695.$$

Since

$$\frac{S_u}{u} \leq F_{7,\,18,\,5\%} = 2.58,$$

the hypothesis is acceptable. The values of B(4), B(5), and B(7) are
probably equal to zero, which is the same conclusion reached by the t
test. The values given for B(1), B(2), B(3), and B(6) are not necessarily
the final values. The new structure, containing only these parameters,
must be solved by another least squares adjustment. For the
data used previously the results of this readjustment and the change
from the values given by Hatch are

             Value          Change
    B(1)     [illegible]    [illegible]
    B(2)     -1.80          [illegible]
    B(3)     [illegible]    [illegible]
    B(6)      0.28          [illegible]

Statistical tests could also be applied to these values.
The previous example could also have been tested using the "R
factor" test discussed by Hamilton (1964). This is a test of the ratio
$R_1 / R_0$. For the details the reader is referred to page 157 of Hamilton.
To point up the difference between the individual tests and a
simultaneous test, the data of Laurila (1965), shown in Figure 6, Section
3.3 of this thesis, will be used.

For this example assume that some theoretical consideration
predicts that the values are





Using the t test

    t_K = [illegible]

    t_C = [illegible] / 0.10

From the tables, $t_{24,\,5\%}$ is 2.06; therefore, these values of K' and C'
are acceptable.

The values of K' and C' can then be used as the test values for
a simultaneous F test.

$$(X^* - X') = \begin{bmatrix} 0.454 \\ -4.70 \end{bmatrix}$$

$$F = \frac{\begin{bmatrix} 0.454 & -4.70 \end{bmatrix} \begin{bmatrix} 25.0 & 272.4 \\ 272.4 & 3824.28 \end{bmatrix} \begin{bmatrix} 0.454 \\ -4.70 \end{bmatrix}}{(2)(0.2129)}$$

    F = 195,683.

The tabulated value of $F_{7,\,18,\,5\%}$ is 2.58; thus the hypothesis can be
rejected although the test values are acceptable on an individual basis.
One would expect this, because the $Q^{-1}$ matrix indicates a strong
correlation between parameters.
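The size of this F value, and the correlation behind it, can be checked from the printed numbers. The sketch below uses the matrix, the difference vector (0.454, -4.70), u = 2 and the variance of unit weight 0.2129 as read from the example; treating the printed matrix as N with unit weights is an assumption, and the variable names are illustrative.

```python
# Normal matrix N = Q^-1 and difference vector as read from the example.
n11, n12, n22 = 25.0, 272.4, 3824.28
d1, d2 = 0.454, -4.70
u, s0_sq = 2, 0.2129

# Quadratic form d' N d written out term by term.
quad = n11 * d1 ** 2 + 2.0 * n12 * d1 * d2 + n22 * d2 ** 2
f_stat = quad / (u * s0_sq)

# Correlation between the two estimated parameters. For a 2 x 2 matrix
# Q = N^-1, q12 / sqrt(q11 * q22) = -n12 / sqrt(n11 * n22).
corr = -n12 / (n11 * n22) ** 0.5
```

The statistic comes out near 195,680, in agreement with the value above to within rounding, and the correlation coefficient of about -0.88 shows why individually acceptable values can be jointly rejected.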
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CHAPTER 4 


LEVEL OF SIGNIFICANCE 

The selection of the significance level to be used in a
statistical test is a matter of judgment and experience on the part of
the geodesist. It may be chosen before the data are analyzed. Basically
the problem is: how large should the critical region for rejection
be, or what risk of committing a Type I error is the geodesist
willing to accept? Dixon (1957) states,

"If it is a matter of great concern when a true
hypothesis is rejected, α should be small ......
if it is a matter of great concern that a hypothesis
be rejected if there is little evidence against
it we should use a large α".

A convention followed by statisticians is the following: If a
hypothesis is rejected at α = 5% it is said to be significant. A
hypothesis rejected at α = 1% is said to be highly significant.

When choosing a significance level for a test, one must keep in
mind that any test favors acceptance of the hypothesis over its
rejection. If an α of 1% is chosen, the critical area
for rejection is small; thus rejection is less likely than if a 5%
level of significance is chosen.

The choice between α = 5% and α = 1% must be dictated by the
circumstances surrounding the problem.

An entirely different approach to hypothesis testing can be
developed by using a slightly different procedure. In this method the
significance level is not pre-selected. The test statistic is computed
by one of the methods previously discussed.

The probability (found in the tables) associated with the value
yielded by the data is taken as an objective measure of the degree of
support that the data lend to the hypothesis. Taking this approach,
the test of significance does not lead to the acceptance or rejection of
the hypothesis; it merely measures the strength of belief. In many
problems encountered in geodesy and photogrammetry this may be the
better approach.
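Where tables are not at hand, the probability associated with a computed t can be approximated numerically. The following is a sketch only, integrating the Student's t density by Simpson's rule; the function names `t_pdf` and `two_sided_p` are hypothetical, and treating the two-sided tail probability as the "degree of support" is an interpretation of the procedure described above.

```python
import math


def t_pdf(t, dof):
    """Density of Student's t with `dof` degrees of freedom."""
    c = math.gamma((dof + 1) / 2) / (math.sqrt(dof * math.pi) * math.gamma(dof / 2))
    return c * (1 + t * t / dof) ** (-(dof + 1) / 2)


def two_sided_p(t, dof, steps=2000):
    """P(|T| >= |t|), using Simpson's rule on [0, |t|] (steps must be even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, dof) + t_pdf(a, dof)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, dof)
    central = 2 * total * h / 3          # P(-|t| <= T <= |t|)
    return 1 - central


# The tabulated 5% value for 9 degrees of freedom returns a probability near 0.05.
p_check = two_sided_p(2.262, 9)
```

A small probability indicates little support for the hypothesis, while a probability near 1 indicates that the data are entirely consistent with it.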

Example: As a final step in his analysis Hatch (1964) represented
each of his regression coefficients by a linear regression

$$b_1 = M_1 e + C_1$$

$$b_3 = M_3 P + C_3$$

$$b_6 = M_6 e + C_6$$

where e is the partial vapor pressure, and P is the barometric
pressure.

By a least squares adjustment he obtained the following:
(p. 66)

                                        Standard Error





The t statistic can be used to determine the significance of
each coefficient

                    Degrees of Freedom        Degree of Support





It can be concluded that there is little relationship between
$b_3$ and P since $M_3$ is significant at a level of less than 50%.





CHAPTER 5 


CONCLUSIONS

This thesis has presented the statistical theory basic to
hypothesis tests. Some of the statistics commonly used in hypothesis
tests have been discussed, and their applications to the problems of
geodesy demonstrated.

The tests most applicable to geodetic problems are
the ones which use a statistic that does not require the true variance
in its computation. Thus the Student's t statistic and the F statistic
are more useful.

The t statistic is used to test the mean of a population with
an estimated standard error, s, and a finite number of observations, n,
against some other mean value.

Fisher's F statistic is given as

$$F = \frac{s_1^2}{s_2^2}$$

It can be used to test hypotheses concerning the variances of two
independent sets of data.

From the adjustment procedures outlined by Dr. Uotila (1966),
all the values needed to compute test statistics are available. In
most cases computations of the test statistic need only be carried to
three significant figures. This precision can be obtained on an
ordinary slide rule.

By using matrix methods the desired test statistic can be computed,
during the adjustment solution, by an electronic computer. The
simultaneous matrix test is preferred over the individual t test in
cases where there is a large correlation between parameters. The person
analyzing the data need only compare the computed statistic with the
tabulated values to make the hypothesis tests.

The purpose of using statistical hypothesis tests in geodesy and
photogrammetry is to guide the user to the best conclusions based on
the data analyzed. It should be emphasized that failure to reject a
hypothesis does not mean the hypothesis is true. If, on the basis of a
test, a hypothesis is rejected, the statement can be made that there is
evidence, from the data analyzed, that the hypothesis is not true.
Conclusions drawn with the aid of statistical tests are thus supported
by probability theory, in addition to the judgment and experience of
the scientist.

APPENDIX I


CUMULATIVE NORMAL DISTRIBUTION

Taken from Appendix, Table I (Hamilton, 1964).

APPENDIX II

STUDENT'S t DISTRIBUTION

Used in conjunction with problems in statistics, the table of
the Student's t distribution permits the evaluation of deviations
expressed in terms of estimates of standard errors for samples of various
sizes. A given estimate of standard error is divided into a difference
or deviation, to obtain t as a basis for a test of significance. The
table is entered with the number of degrees of freedom determined for
the problem. The tabular entry in the column is the value of t
associated with the probability level indicated at the top of the
column. This level expresses the probability of obtaining a difference
as large as the one obtained due to chance (Arkin and Colton, 1959).
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Student's t Distribution 


Probahility 
Degrees of 
1 6. 63° 66 
9 2 9 92 
3 Me ie ; Hohl 
4 » pip 3. 4 60 
6 us o: 3. 1: 3.71 
: i 2 3, 3 50 
8 ie o c. 3.36 
9 i. De. ee Oo. 20 
10 1. a a 3.17 
11 i 2 oe Sal 
5 1 2. 2.6 3.06 
13 1. a a 3.01 
i4 ie ale Zl 2.98 
15 1. oO Pa § 2.99 
18 1 2. 2. nea 
19 i ae 2a 2.86 
99 1 PA, Ze oe 
26 1. 2. Zon ete 
99 1 9: (); D Zab 
AQ ie oe aoe al 
45 1. oF ae 2 69 
50 1. 2: 2. 2.65 
90 1 & 7 oS 2.00 
100 1. 1. oy 2.63 
150 1.6 1. 2 2 61 
300 1. . et ee a 
400 1 ¢ 1 ¢ a, 2.59 
500 l. 1. oe 2000 
1000 Le ly 2. 2.53 





* The greater portion of this table taken from R. A. Fisher's "Statistical Methods for Research
Workers," with the permission of the author and his publishers, Oliver and Boyd, London.

Source: Reproduced by permission from C. H. Goulden, Methods of Statistical Analysis (New
York: John Wiley & Sons, 1939).

Taken from (Arkin and Colton, 1959)
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APPENDIX III 


TABLE OF CHI-SQUARE

The table of Chi-square is entered with the degrees
of freedom appropriate to the problem. The row for the specified
degrees of freedom is followed across to the columns corresponding to
α/2 and 1 - α/2, where the theoretical values of Chi-square needed for
the test are found.
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Percentage Points of the y? Distribution’ f 


Values of yin, Where a is the probability that 4? execeds x2, a, and 


zn ee 
i: att) aye eae 


i I a 





O08 O.000 O75 O.950 0.500 0.050 Q.025 0.010 0.005 


0004+ 0.004 0.004+ 0.004 0.45 3.81 la 6.63 ios 
0.01 ue 0.05 0.10 leas? 5.99 7.38 N21 10.60 
0.07 O.11 ee? 0.335 wd 7.81 935 11.34 12.81 
0.21 0.30 0.18 0.71 3.36 949 I1.11 13.28 9 1-1.86 
0.11 0.55 0.833 1.15 $35 11.07 12.88 15.09 = 16.75 


0.68 0.87 Lew 1.64 9309 12.59 14.15 16.81 18.55 
0:99 ] 24 1.69 2.17 6.35 14.07 16.01 IS.48 20.28 
1.31 1.65 2.18 i 7.31 ih! [eno 2009 21-96 
1.73 2.09 2.40) 3.33 8.3:1 sO. 2 IEG? 23.58) 
pl na ea) 3.91 9.34 18.31 “O48 23.21 40505 


2.60 3.05 3.82 4.57 10.31 [UGS ~2192 2172 26.76 
3.07 ae 110 9.23 13h 21.03 23.394 206.22 28.30 
3.07 Td 9.01 Doo ee ieee oe tee 27.69. 29.82 
4.07 1.66 5.63 6.07 3.3 23.68 26.12 29.14 31.32 
4.60 een ti.27 7.260 EES t 25.00 27.49 30.58 = 32.80 


= 


5.14 5.81 6.91 7.96 19.31 26.30 2885 32.00 3:1. 
5.470 6.41 7.00 8.67 16.31 27.59 30.19 33.411 35.72 
6.26 7.01 Oc 959 17.38 | 28:87 31.53 $4.81 37.16 
6.84 7.63 8.91 10.12 18.34 380.14 32.85 36.19 38.58 
7.43 8.26 9.59 10.85 19.31 38bd4l 34.17 37.57 40.00 


T to 


10.52 1.52 13.12 14.61 21.314 37.65 40.65 44.31 46.93 
13.79 14.95 16.79 18.49 29.31 43.77 46.98 50.89 53.67 
20.71 22.16 23> 2a eee o5.76 5081 03.69 9 66.77 
27.990 207) 32300 Sie wee Gro Fly 7015. 79.16 
$9.03 37.18 40418 13.19 59.33 79.08 83.30 88.38 91.95 





13.28 A511 18.76 5b7' 69.38 90.53 95.02 100.42 104.22 
O17 93.51) 57.15 “i030 79.33 10Ss 0b. 911233° 116.32 
59.20 61.75 65.65 69.18 $9.33 113.14 TIS. L24.12 © £28.30 
67.33 70.06 74.22 77.93 99.330 124.34 129.56 135.81 110.17 


* Adapted from the tables prepared by Catherine M. Thompson for Biometrika,
vol. 32; reproduced with permission of the editors of Biometrika.

† For more than 100 degrees of freedom, percentage points $\chi^2_{n,\alpha}$ of the $\chi^2$ distribution
may be obtained from the two-tailed percentage points $N_P$ of the normal distribution
by the approximate relation $\chi^2_{n,\alpha} \approx n + (2n)^{1/2} N_P$, with α = P.

‡ α is thus the probability in one tail of the distribution.

Taken from Appendix Table III (Hamilton, 1964)

APPENDIX IV

TABLE OF F

"In the table of F the distribution of F is tabulated for the 5%
and 1% residual levels. The 5% level value of F is indicated in ordinary
type, and the 1% level figures are printed in bold face type. Entry into
the table is accomplished by reference to the appropriate column for
the degrees of freedom associated with the greater variance, and to the
appropriate row for the degrees of freedom associated with the smaller
variance. If the calculated ratio between the two variances (F) exceeds
the value for F indicated in the body of the table for the 5% level,
there are fewer than 5 chances in 100 that the disparity between the
calculated variances is due to chance; if F exceeds that recorded for
the 1% level, the probability is less than 1 in 100 that the difference
is accidental." (Arkin and Colton, 1959).
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) .9995 | 266 | 237 | 225 | 218 \2 4| 211) 2091 208 | 207 | 206 ee 9995 

Read .0³56 as .00056, 200¹ as 2000, 162⁴ as 1620000, etc.


Taken from (Dixon and Massey, 1957) 
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